Whole genome sequence analysis of the simulated systolic blood pressure in Genetic Analysis Workshop 18 family data: long-term average and collapsing methods
Abstract
Analysis of longitudinal family data is challenging because of 2 sources of correlations: correlations across longitudinal measurements and correlations among related individuals. We investigated whether analysis using long-term average (average of all 3 visits) can enhance gene discovery compared with a single-visit analysis. We analyzed all 200 replicates of simulated systolic blood pressure (SBP) in Genetic Analysis Workshop 18 (GAW18) family data using both single-marker and collapsing methods. We considered 2 collapsing approaches: collapsing all variants and collapsing low-frequency variants. Analysis using long-term average performed slightly better than SBP measured at a single visit. Collapsing all variants performed much better than collapsing low-frequency variants at MAP4 and FLNB, which included a common variant with a relatively large effect. For several variants in gene MAP4, single-marker analysis also provided high power. In contrast, collapsing only low-frequency variants performed much better for SCAP, DNASE1L3, and LOC152217, where rare variants in these genes had larger effect than common variants. However, for other causal variants, all approaches provided disappointingly poor performance. This poor performance appeared to occur because most of these causal variants explained a very small fraction of phenotypic variance. We also found that collapsing multiple variants did worse than single-marker analysis for several genes when they contained causal single-nucleotide polymorphisms (SNPs) with both positive and negative effects. Because half of causal SNPs were not found in the annotation file based on the 1000 Genomes Project, we found that power was also affected by our use of incomplete annotation information.
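The two phenotype and genotype summaries named in the abstract, the long-term average and the collapsed variant score, can be sketched as follows. This is a minimal illustration, not the authors' pipeline. The function names, the 0/1/2 genotype coding with the minor allele as the coded allele, and the `maf_threshold` parameter are all assumptions made for the example, and the abstract does not state the frequency cutoff that was used. The sketch also ignores the correlation among relatives, which a real analysis of family data would have to model.

```python
import numpy as np


def long_term_average(sbp_visits):
    """Average SBP over visits for each individual.

    sbp_visits: array of shape (n_individuals, n_visits); missing visits are NaN
    and are skipped when averaging.
    """
    return np.nanmean(np.asarray(sbp_visits, dtype=float), axis=1)


def collapse_variants(genotypes, maf_threshold=None):
    """Collapse the variants of one gene into a single score per individual.

    genotypes: array of shape (n_individuals, n_variants), coded 0/1/2
    (assumed here to count the minor allele).
    maf_threshold: hypothetical cutoff; if given, only variants whose minor
    allele frequency is below it are collapsed (the low-frequency approach).
    If None, all variants are collapsed.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    if maf_threshold is not None:
        freq = genotypes.mean(axis=0) / 2.0        # coded-allele frequency
        maf = np.minimum(freq, 1.0 - freq)         # minor allele frequency
        genotypes = genotypes[:, maf < maf_threshold]
    # The score is the total allele count across the retained variants.
    return genotypes.sum(axis=1)


# Toy data: 2 individuals with 3 visits, and 4 individuals with 3 variants.
sbp = np.array([[120.0, 130.0, 125.0],
                [110.0, np.nan, 114.0]])
geno = np.array([[0, 1, 0],
                 [1, 1, 0],
                 [2, 0, 1],
                 [1, 2, 0]])

avg_sbp = long_term_average(sbp)
score_all = collapse_variants(geno)
score_rare = collapse_variants(geno, maf_threshold=0.2)
```

Because the score is a plain sum, variants whose effects point in opposite directions can cancel each other out, which is consistent with the abstract's observation that collapsing did worse than single-marker analysis for genes containing causal SNPs with both positive and negative effects.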
Related articles
Comparison of multilevel modeling and the family-based association test for identifying genetic variants associated with systolic and diastolic blood pressure using Genetic Analysis Workshop 18 simulated data
Identifying genetic variants associated with complex diseases is an important task in genetic research. Although association studies based on unrelated individuals (i.e., case-control genome-wide association studies) have successfully identified common single-nucleotide polymorphisms for many complex diseases, these studies are less likely to identify rare genetic variants. In contrast, family-...
Data for Genetic Analysis Workshop 18: human whole genome sequence, blood pressure, and simulated phenotypes in extended pedigrees
Genetic Analysis Workshop 18 (GAW18) focused on identification of genes and functional variants that influence complex phenotypes in human sequence data. Data for the workshop were donated by the T2D-GENES Consortium and included whole genome sequences for odd-numbered autosomes in 464 key individuals selected from 20 Mexican American families, a dense set of single-nucleotide polymorphisms in ...
Omics-squared: human genomic, transcriptomic and phenotypic data for genetic analysis workshop 19
BACKGROUND: The Genetic Analysis Workshops (GAW) are a forum for development, testing, and comparison of statistical genetic methods and software. Each contribution to the workshop includes an application to a specified data set. Here we describe the data distributed for GAW19, which focused on analysis of human genomic and transcriptomic data. METHODS: GAW19 data were donated by the T2D-GENES ...
Adjustment of familial relatedness in association test for rare variants
High-throughput sequencing technology allows researchers to test associations between phenotypes and all the variants identified throughout the genome, and is especially useful for analyzing rare variants. However, the statistical power to identify phenotype-associated rare variants is very low with typical genome-wide association studies because of their low allele frequencies among unrelated ...
Analysis of baseline, average, and longitudinally measured blood pressure data using linear mixed models
This article compares baseline, average, and longitudinal data analysis methods for identifying genetic variants in a genome-wide association study using the Genetic Analysis Workshop 18 data. We apply methods that include (a) linear mixed models with baseline measures, (b) random intercept linear mixed models with the mean measure as outcome, and (c) random intercept linear mixed models with longitudi...
Volume 8
Publication date: 2014